Module 4: OpenStack in production


Mirantis, 2012 


Goals 


e Understand how to deploy OpenStack for real-world use

e Be able to plan an OpenStack deployment

e Become familiar with available deployment tools and options

e Practice OpenStack operation on multiple nodes

OpenStack HA


Transitional Failure Effects

e Level 1
o Failure of a component doesn't lead to permanent disruption of service
e Level 2
o Failure of a component doesn't lead to failure of new requests
e Level 3
o Failure of a component doesn't lead to failure of any request (new or currently executing)

OpenStack HA: api-services

e Services
o nova-api
o glance-api
o glance-registry
o keystone
e Approach
o Multiple instances with a load balancer (LB) in front of them
o Healthcheck (if supported by LB and API)
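
The approach above can be illustrated with a minimal sketch of what a load balancer does with a health check: rotate over the API instances and skip the ones reported unhealthy. The class name, the backend names and the `is_healthy` callback are hypothetical and exist only for this illustration; a real deployment would rely on the load balancer's own health checking.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer that skips unhealthy backends (illustration only)."""

    def __init__(self, backends, is_healthy):
        self._backends = list(backends)
        self._is_healthy = is_healthy      # callback: backend -> bool
        self._ring = cycle(self._backends)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self._backends)):
            backend = next(self._ring)
            if self._is_healthy(backend):
                return backend
        raise RuntimeError("no healthy backend available")
```

With three nova-api instances and the second one down, successive calls to `pick()` alternate between the first and the third instance.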

Types of Failures

e Service instance failure
e Machine failure
e Network partition


Surviving Service Failure

[Diagram labels: Virtual IP, nova-api]


Surviving LB Failure

[Diagram labels: Virtual IP, Virtual IP, Virtual IP]


Software HA LB

VIP: 192.168.56.210


API servers HA

OpenStack HA: compute services

e Services
o nova-compute
o nova-network
o nova-volume
e Approach
o Deploy service instance on each compute node
o Monitor availability and try to restart automatically

nova database HA

e MySQL
o Multi-Master Replication Manager (MMM)
o MySQL Cluster
e PostgreSQL
o pgpool-II middleware


MySQL HA: no MMM

[Diagram labels: sql client, sqlalchemy, State DB (MySQL) Master, State DB (MySQL) Master, Replication, pacemaker]


MySQL HA: MMM

[Diagram labels: sql client]


PostgreSQL HA

[Diagram labels: sqlalchemy, pacemaker, State DB (PostgreSQL) Slave, State DB (PostgreSQL) Master]

RabbitMQ HA

e Pacemaker for HA
e Option 1
o Single active RabbitMQ + 1 standby
o Uses internal mirroring to sync queues


RabbitMQ HA

e Option 2
o Multiple RabbitMQ instances with mirrored queues
o Requires nova code patch
o Selects the first instance that is able to respond
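
The "select the first instance that responds" behaviour of Option 2 can be sketched as follows. This is not the nova patch itself, only an illustration of the idea: the function name and broker host names are hypothetical, and 5672 is the usual AMQP port.

```python
import socket


def first_responsive(brokers, timeout=1.0, connect=socket.create_connection):
    """Return (address, connection) for the first broker that accepts a TCP connection.

    `brokers` is an ordered list of (host, port) tuples; `connect` is injectable
    so the selection logic can be exercised without a network.
    """
    last_error = None
    for host, port in brokers:
        try:
            return (host, port), connect((host, port), timeout)
        except OSError as exc:
            last_error = exc          # remember the failure and try the next broker
    raise ConnectionError(f"no broker responded: {last_error}")
```

A caller would pass something like `[("rabbit-1", 5672), ("rabbit-2", 5672)]` and use whichever connection comes back.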

RabbitMQ HA

e Configure OpenStack to use mirrored queues (--rabbit_durable_queues flag)
e Configure TCP keepalive to monitor AMQP connection
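
As an illustration of the TCP keepalive point, the sketch below enables keepalive probes on a client socket so that a dead broker connection is noticed by the kernel. The timing values are arbitrary examples, not recommendations from this course, and the fine-grained options are only set where the platform exposes them (they are Linux-style names).

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive for `sock` and tune it where the OS allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # failed probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:        # not every platform defines these constants
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```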

RabbitMQ HA

[Diagram labels: kombu client library, RabbitMQ/Qpid, RabbitMQ/Qpid, Message queue, Message queue]

nova-scheduler

e Deploy 2nd scheduler and update nova.conf
e 2nd scheduler will create 2 queues
o server.<hostid>
o server_fanout_id
e If 1 scheduler dies, the consumer gets the message from the 2nd queue

nova-compute & nova-network HA

e nova-compute is deployed on each compute node, so HA is not required
e nova-network can be deployed on each compute node (multi-host mode)


Keystone

e Multiple Keystone service instances can run in a redundant configuration behind an HTTP load balancer
e Virtual IP of the load balancer will be the entry point for Keystone


Physical deployment

Standard reference architecture can be found at referencearchitecture.org

Hardware nodes


Endpoint Node

e I/O and CPU intensive
e Requires best available bandwidth
e Bonding of network interfaces is highly recommended
e No specific disk requirements


Controller node

e Use network interface bonding
e Use RAID1 or RAID10
e Minimal HW spec
o Single 6-core CPU
o 8GB of RAM
o 2x 1TB HDD in software RAID1


Compute node

e Use as much memory and CPU as possible
e Density requirements are driven by applications
e Can be run with a non-redundant single-disk (SSD recommended) configuration (if nova-volume is used)


Storage node

e At least 6 disks
o Deploy OS on redundant disk array (RAID1)
o 4 disks to be Swift storage devices
e 2 NICs
o 1 for internal cluster data exchange
o 1 for uploading/downloading files
e A lot of RAM
o Can utilize Linux cache to improve I/O throughput

Logical deployment architectures


Logical architecture

e HW load balancer
e Simple controller redundancy
e Dedicated endpoint node


Hardware load balancer

[Diagram labels: Controller Node 1, 2; Compute Node; Storage Node; Communication: REST API, HTTP dashboard, Load balanced HTTP]

e HW load balancer instead of Endpoint node
e API servers are deployed on compute nodes
e nova-scheduler instances are deployed on Compute nodes
e glance-registry instances are deployed on Controller nodes


Simple Controller redundancy

[Diagram labels: Controller Node 1, Controller Node 2]

e Endpoint nodes are combined with Controller nodes
e API services and nova-schedulers are deployed on Controller nodes
e Controller can be scaled by adding nodes and reconfiguring HAProxy
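
To make "adding nodes and reconfiguring HAProxy" concrete, here is a small sketch that renders an HAProxy `listen` section from a list of controllers, so that scaling out means adding one tuple and regenerating the section. The controller names and addresses are hypothetical; the VIP is the one shown on the Software HA LB slide, and 8774 is assumed as the nova-api port.

```python
def haproxy_listen_block(name, vip, port, controllers):
    """Render an HAProxy 'listen' section balancing one API service.

    `controllers` is a list of (node_name, ip_address) tuples.
    """
    lines = [
        f"listen {name}",
        f"    bind {vip}:{port}",
        "    balance roundrobin",
        "    option httpchk",          # HTTP health check against each server
    ]
    for node_name, address in controllers:
        lines.append(f"    server {node_name} {address}:{port} check")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical controller addresses, for illustration only.
    print(haproxy_listen_block(
        "nova-api", "192.168.56.210", 8774,
        [("controller1", "192.168.56.11"), ("controller2", "192.168.56.12")],
    ))
```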

Dedicated Endpoint node

[Diagram labels: Controller Node 1 (Active), Controller Node 2 (Standby), Communication, RabbitMQ]

e API services are deployed on Controller nodes
e nova-schedulers are deployed on Controller nodes


Multi-host networking scheme

[Diagram labels: public switch, floating_ip, floating_ip]


Physical network design


Private network

e Internal network to connect VMs within a project
e Segmented into separate isolated VLANs (single VLAN per project)
e Routing of packets is provided by the Compute node

Physical network design

[Diagram labels: Corporate/Public, Compute Node, Endpoint Node, Controller Node, Compute cluster, Swift cluster, Management]

© Mirantis, Inc, 2012. All rights reserved.


Corporate/Public network

e Single C-class network
e Connects cloud clients to the cluster
e Public network
o Is accessed from Private networks via address translation
o VMs access network via SNAT (on default gateway)
o Floating IPs
e Corporate network
o Is connected to Private networks using IP routing mechanism
o In case of multi-host should have dynamic routing protocol (OSPF) between Compute and Corporate




Management network

e Single Class C network from private IP address range (not globally routed)
e Connects services and components of OpenStack
e Should be isolated from Private/Public networks (for security reasons)
e Can be a part of Corporate network
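
The first requirement above can be checked mechanically while planning. The sketch below treats "Class C" as a /24 prefix and "private IP address range" as the RFC 1918 blocks; both readings are assumptions of this example, and the function name is hypothetical.

```python
import ipaddress

# RFC 1918 private IPv4 blocks (not globally routed).
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_class_c_private(cidr):
    """True if `cidr` is a /24 IPv4 network inside an RFC 1918 block."""
    network = ipaddress.ip_network(cidr, strict=True)
    if not isinstance(network, ipaddress.IPv4Network):
        return False
    return network.prefixlen == 24 and any(
        network.subnet_of(block) for block in RFC1918_BLOCKS
    )
```

For example, `192.168.56.0/24` passes the check, while a public /24 or a private /16 does not.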

Storage Network

e Single Class C network from private IP address range
e Connects Swift storage nodes and proxy nodes
e Has to be separated from other networks

General monitoring considerations

e What to monitor
o Physical servers availability
o Platform services availability
o OpenStack services availability
o System metrics
e Use monitoring to implement basic self-healing
o Attempt a service restart if a service has failed
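
A minimal sketch of the self-healing idea: probe each service and attempt a restart when the probe fails. It assumes processes can be found with `pgrep -f` and restarted through the `service` command; both are assumptions about the host, and a monitoring system's event handlers would normally play this role.

```python
import subprocess


def is_running(process_name):
    """True if pgrep finds at least one process matching `process_name`."""
    result = subprocess.run(
        ["pgrep", "-f", process_name],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart(service_name):
    """Attempt a restart; assumes services are managed via `service`."""
    return subprocess.run(["service", service_name, "restart"]).returncode == 0


def heal(services, probe=is_running, fix=restart):
    """Restart every service whose probe fails; return the names attempted."""
    attempted = []
    for name in services:
        if not probe(name):
            fix(name)
            attempted.append(name)
    return attempted
```

Calling `heal(["nova-compute", "nova-network"])` periodically (for example from cron) gives a very basic watchdog.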

Monitoring with Nagios


Physical nodes availability

e Use check-host-alive Nagios plugin (uses ICMP echo)


Platform services availability

e Database: check_mysql or check_pgsql
e RabbitMQ: custom rabbitmq plugin
e Libvirt: check_libvirt
e dnsmasq: check_dhcp


nova services availability

e nova-api: check_http plugin
e nova-scheduler: check_procs plugin
e nova-compute: check_procs plugin
e nova-network: check_procs plugin
e Network configuration on CloudController: check_ping plugin


Keystone and Glance

e Keystone service: check_http plugin
e glance-api: check_http plugin
e glance-registry: check_http plugin
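
For readers who have not written a Nagios check, the sketch below shows the plugin contract that checks such as check_http follow: print one status line and return 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). It is a simplified stand-in, not the real check_http; mapping 4xx to WARNING and 5xx to CRITICAL is a choice made for this example.

```python
import urllib.error
import urllib.request

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_http(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (nagios_status, message) for an HTTP GET of `url`."""
    try:
        with opener(url, timeout=timeout) as response:
            return OK, f"HTTP OK - status {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        status = CRITICAL if exc.code >= 500 else WARNING
        return status, f"HTTP status {exc.code}"
    except OSError as exc:
        # Connection refused, timeout, DNS failure and similar.
        return CRITICAL, f"HTTP CRITICAL - {exc}"


def main(argv):
    """Plugin entry point: print the message, return the exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_api.py URL")
        return UNKNOWN
    status, message = check_http(argv[1])
    print(message)
    return status
```

Saved as a script, it would end with `sys.exit(main(sys.argv))` so that Nagios receives the status as the process exit code.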

Internal OpenStack Checks

_check_instance_build_time: If the instance has been in BUILD state for more time than the flag indicates, then it is set to ERROR state.

_cleanup_running_deleted_instances: Cleans up any instances that are erroneously still running after having been deleted. Cleaning behavior is determined by the flag. Can be: noop (do nothing), log, reap (shut down).

_poll_rebooting_instances (XEN): Forces a reboot of the instance once it hits the timeout in REBOOT state.

_sync_power_states: If the instance is not found on the hypervisor, but is in the database, then it will be set to power_state.NOSTATE (which is currently translated to PENDING by default).
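
The logic of the first check can be sketched in a few lines. This is not nova's code: the `Instance` class, its field names and the `build_timeout` parameter are hypothetical stand-ins for the database record and the flag mentioned above.

```python
import time
from dataclasses import dataclass


@dataclass
class Instance:
    uuid: str
    vm_state: str            # e.g. "BUILD", "ACTIVE", "ERROR"
    build_started_at: float  # seconds since the epoch


def check_instance_build_time(instances, build_timeout, now=None):
    """Move instances stuck in BUILD for longer than `build_timeout` seconds to ERROR.

    Returns the UUIDs of the instances that were changed.
    """
    now = time.time() if now is None else now
    timed_out = []
    for instance in instances:
        stuck_for = now - instance.build_started_at
        if instance.vm_state == "BUILD" and stuck_for > build_timeout:
            instance.vm_state = "ERROR"
            timed_out.append(instance.uuid)
    return timed_out
```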

Not Checked by OpenStack

Floating IPs attachment: If we flush the NAT table on the compute node, then we just lose connectivity to the instance to which it was pointing and OpenStack still reports the floating IP as “associated.”

Presence of security groups: If we flush these rules or modify them by hand, we might accidentally open access holes to instances and OpenStack will not know about them.

Bridge connectivity: If we bring a bridge down accidentally, all the VMs attached to it will stop responding, but they will still be reported as ACTIVE by OpenStack.

Presence of iSCSI target: If we tamper manually with the tgtadm command on the nova-volume node, or just happen to lose the underlying logical volume that holds the user’s data, OpenStack will probably not notice and the volume will be reported as OK.

Performance Tuning


Levels of Performance Tuning

e Host Performance
e KVM Performance
e Guest Performance


Host Performance

e CPU
o Change the default "ondemand" CPU frequency governor to "performance"
o Enable hyperthreads - 30%-50% throughput increase
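
On Linux hosts the CPU frequency policy is exposed through sysfs, so the first tweak can be scripted. The sketch below assumes the usual `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` layout and must run as root on a real host; the function name is hypothetical.

```python
from pathlib import Path


def set_governor(governor="performance", cpu_root="/sys/devices/system/cpu"):
    """Write `governor` to every CPU's scaling_governor file.

    Returns the names of the CPUs whose setting was changed.
    """
    changed = []
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(cpu_root).glob(pattern)):
        if path.read_text().strip() != governor:
            path.write_text(governor + "\n")
            changed.append(path.parent.parent.name)   # e.g. "cpu0"
    return changed
```

The `cpu_root` parameter exists so the function can be tried against a scratch directory before touching a real host.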

Host Performance

e Memory
o Reserve huge pages and configure huge page filesystem (http://wiki.debian.org/Hugepages)
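
How many huge pages to reserve is simple arithmetic: the memory to be backed by huge pages divided by the huge page size, rounded up. The sketch assumes 2 MiB pages (a common x86-64 size) and the `vm.nr_hugepages` sysctl; the 24 GiB figure in the usage line is an arbitrary example.

```python
import math


def hugepages_needed(guest_memory_mib, hugepage_size_kib=2048):
    """Number of huge pages required to back `guest_memory_mib` MiB of guest RAM."""
    return math.ceil(guest_memory_mib * 1024 / hugepage_size_kib)


def sysctl_line(pages):
    """Line to place in /etc/sysctl.conf to reserve the pages at boot."""
    return f"vm.nr_hugepages = {pages}"


if __name__ == "__main__":
    # Example: reserve huge pages for 24 GiB of guest memory.
    print(sysctl_line(hugepages_needed(24 * 1024)))
```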

Host Performance

e Network
o For 10Gb NICs: MTU=9000, txqueuelen=10000
o Spread traffic over NICs and switches
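
The NIC settings above map onto a single `ip link set` invocation. The sketch only builds the command so it can be reviewed or logged before being executed; the device name `eth2` is hypothetical, and jumbo frames also need matching support on the switches.

```python
import shlex


def nic_tuning_command(device, mtu=9000, txqueuelen=10000):
    """Build the iproute2 command that applies the MTU and queue length."""
    return ["ip", "link", "set", "dev", device,
            "mtu", str(mtu), "txqueuelen", str(txqueuelen)]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(..., check=True) as root to apply it.
    print(shlex.join(nic_tuning_command("eth2")))
```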

KVM Performance

e Memory
o Use HugeTLBfs on host and for guests (~10% boost)
e Disk
o Write Back Caching for NVRAM based RAID cards
o RAID 10 for VM disks (when possible)
e Network
o Use virtio for bridges in nova.conf
o Load the vhost_net module on the host
o Combination of these tweaks gives ~9Gb/sec on 10Gb switches
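
Whether the vhost_net module is actually loaded can be verified from `/proc/modules`, where each line starts with a module name. A small sketch of such a check follows; the function name is hypothetical.

```python
def module_loaded(name, modules_file="/proc/modules"):
    """True if kernel module `name` is listed in `modules_file`."""
    with open(modules_file) as handle:
        for line in handle:
            fields = line.split()
            if fields and fields[0] == name:
                return True
    return False
```

On a compute host, `module_loaded("vhost_net")` returning False suggests running `modprobe vhost_net` before starting guests.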

Guest Performance

e CPU
o Ensure the "performance" CPU frequency governor is used
e Network
o Use virtio for bridges in nova.conf
o Load the vhost_net module on the host
o Combination of these tweaks gives ~9Gb/sec on 10Gb switches

Deployment Tools Overview

e Toy deployment
o DevStack
e Small-scale lab deployment
o Dell Crowbar
o Rackspace Alamo
o StackOps
e Large-scale production deployment
o On your own

DevStack Overview

e Pros
o Very easy to install on a single node
o Provides screen session with all logs and errors available
o Allows specifying different network managers
e Cons
o Do not use it for production deployment
o Unstable - fetches code from OpenStack master branch

Alamo Overview

e Pros
o Very easy to install - burn an ISO and follow the instructions
o Deploys Essex Stable
o Endorsed and tested by Rackspace
e Cons
o 20-node limit
o No HA
o Only FlatDHCPManager supported


StackOps Overview

e Pros
o Deployment profiles stored online
o Rich deployment wizard
o Single ISO for all nodes
o Supports All-in-One, Dual-Node and Multi-Node
e Cons
o Requires Internet access to deploy profiles to nodes
o No HA
o Only FlatDHCPManager supported


Dell Crowbar Overview

e Pros
o Installs via ISO
o Deploys Essex Stable
o Configures RAID and BIOS for Dell hardware
o Can also install Hadoop
e Cons
o Initial networking configuration is hard
o No HA
o Only FlatDHCPManager supported

Dell Crowbar Overview

e Uses Opscode's Chef as an underlying deployment tool
e Does bare-metal provisioning of nodes for deployment
e For Dell reference architecture provides advanced capabilities
o BIOS and RAID configuration
o Automatic configuration of Dell switches


Dell Crowbar

[Diagram labels: Chef Server, barclamps, Config Files, IP Address, Linux Kernel, OpenStack, Packages download from recipes]


Crowbar Primitives

e Barclamp
o Extended Chef recipes, including
m Advanced networking
m Nagios
m Ganglia
e Proposal
o A deployment topology
o Applied to the available nodes of the cluster

Deployment with Crowbar flow

1. Deploy Crowbar admin node
2. Allocate nodes
a. Turn on nodes
b. Fetch Sledgehammer (automatically)
c. PXE boot the real system
3. Create a proposal
a. OpenStack deployment topology
b. Specify applicable barclamps
4. Apply proposal

On your own

e Capacity planning
o How many VMs per node?
o What is the allowed oversubscription?
o What is the usage scenario?
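
The first two capacity questions are linked by simple arithmetic: the number of VMs a node can host is bounded by both CPU and RAM after applying the allowed oversubscription ratios. The sketch below is a back-of-the-envelope calculator; every figure in the example is hypothetical and the ratios are inputs, not recommendations.

```python
def vms_per_node(cores, ram_gib, vcpus_per_vm, ram_per_vm_gib,
                 cpu_ratio=1.0, ram_ratio=1.0, reserved_ram_gib=0):
    """Estimate how many identical VMs fit on one compute node.

    cpu_ratio / ram_ratio are the allowed oversubscription factors
    (1.0 means no oversubscription); reserved_ram_gib is kept for the host.
    """
    by_cpu = int(cores * cpu_ratio // vcpus_per_vm)
    by_ram = int((ram_gib - reserved_ram_gib) * ram_ratio // ram_per_vm_gib)
    return max(0, min(by_cpu, by_ram))


if __name__ == "__main__":
    # Hypothetical node: 24 cores, 96 GiB RAM, 8 GiB reserved for the host;
    # VMs with 2 vCPUs and 4 GiB each; 4x CPU and 1.5x RAM oversubscription.
    print(vms_per_node(24, 96, 2, 4, cpu_ratio=4.0, ram_ratio=1.5,
                       reserved_ram_gib=8))
```

In this example RAM is the limiting resource, which is typical when the CPU oversubscription ratio is much higher than the RAM one.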

On your own

e BOM
o Networking gear (preferably with OpenFlow support)
e Deployment
o OpenStack + Monitoring + HA
o Integration with legacy services